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Abstract 

< 

The recently introduced backward Monte-Carlo method [Johan Carlsson, 
arXiv:math.NA/0010118] is validated, benchmarked, and compared to the con- 
ventional, forward Monte-Carlo method by analyzing the error in the Monte-Carlo 
solutions to a simple model equation. In particular, it is shown how the backward 
method reduces the statistical error in the common case where the solution is of 
interest in only a small part of phase space. The forward method requires binning of 
particles, and linear interpolation between the bins introduces an additional error. 
Reducing this error by decreasing the bin size increases the statistical error. The 
backward method is not afflicted by this conflict. Finally, it is shown how the poor 
time convergence can be improved for the backward method by a minor modifi- 
cation of the Monte-Carlo equation of motion that governs the stochastic particle 
trajectories. This scheme does not work for the conventional, forward method. 
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1 Introduction 



The simplest parabolic equation is the canonical diffusion equation, 

df d df 
at ax ax 
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with some initial condition f(x, 0) = <&(x). The conventional, forward, weighted 
Monte-Carlo solution to Eq. (1) is given by 



N 



(2) 



i=l 



where the stochastic variables Xf*(T) are found by following the stochastic 
trajectories given by the forward Monte-Carlo difference equation of motion: 



X^(t + At) = X^(t) + fiAt + C<rV At , 



N 



(3) 



where \i = dD/dx, a = V2D, and ( is a zero-mean, unit- variance Gaussian 
random number, ( G N(0, 1). The points where the particles are launched, 
X~*(0), must be chosen so that f-+(x,T = 0), as given by Eq. (2), approximates 
$(x). The algorithm is illustrated in Fig. 1 below. 

The recently introduced backward Monte-Carlo method [1] is both strikingly 
similar and fundamentally different from its older, forward sibling. The back- 
ward solution to Eq. (1) is 



N 



(4) 



i=i 



where the stochastic variables X/~(0) are found by following the stochastic 
trajectories given by the backward Monte-Carlo difference equation of motion: 

Xr(t-At) =Xr(t) +f iAt + (aVAt , Xr(T)=x,i = l,...,N . (5) 



&(X?(0)) 




t=T 



Fig. 1. The conventional, forward, weighted Monte-Carlo method. The trajectories 
as given by the Monte-Carlo equation of motion (3). 
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$m(0))<s>(xr(0)) 



t=T 




Fig. 2. The backward Monte-Carlo method. The trajectories as given by the back- 
ward Monte-Carlo equation of motion (5). 

Comparing the forward and backward solutions, Eqs. (2) and (4), respectively, 
the appearance of (^-functions in the forward solution is the most striking dif- 
ference. To understand why the backward method produces a smooth solution, 
i.e. without 5-functions, we have to go back to its derivation [1], which relies 
heavily on the Feynman-Kac formula [2]. In fact, Eq. (4), supplemented by 
Eq. (5), is just the numerical approximation of the stochastic Feynman-Kac 
representation of Eq. (1). The Feynman-Kac formula relates a parabolic equa- 
tion to its naturally associated stochastic differential equation (SDE). This 
SDE governs microscopic motion going backward in time, and Eq. (5) is its dif- 
ference approximation. As a consequence of time running in different directions 
in the macroscopic and microscopic world, we get the natural initial condition, 
X^~(T) = x in Eq. (5); i.e. we launch the particles from the point in extended 
phase space where we want to know the solution to the parabolic equation 
and let them work their way backward in time to sample from the initial con- 
dition $(JQ~(0)) [see Fig. 2 above]. As is shown in appendix A, the forward 
Monte-Carlo method can be derived by forcing the naturally associated SDE 
to govern microscopic motion going forward in time. This coercion is responsi- 
ble for both the (^-functions in the forward Monte-Carlo solution, Eq. (2), and 
the "fuzzy initial condition", Xf(0) e {X~* (0) | /_ > (x,T = 0) « $(x)}, im- 
posed on the forward Monte-Carlo equation of motion, Eq. (3) [see appendix A 
for details]. 



However, the backward algorithm was not developed for esthetic reasons. The 
motivation originally came from the frustration over the extremely inefficient 
use of test particles of the forward method in cases where only the solution 
in a small part of phase space contributes significantly to the physics of in- 
terest. An example of this type of problem is the heating of fusion plasmas 
by cyclotron-resonance absorption on ions, where only the high-energy tail 



3 



of the solution is of relevance; but most test particles are wasted on the al- 
most Maxwellian bulk, which has been simulated with e.g. the FIDO forward 
Monte-Carlo code [3]. Earlier claims [1] that the backward method is vastly 
superior in these cases will be substantiated in section 3. 

Before that, the sources of numerical error, for both the forward and the back- 
ward method, will be identified and briefly discussed in the next section below. 
In section 3 we will compare the forward and backward Monte-Carlo solutions 
to a simple model equation. In section 4, it will be shown how a minor modi- 
fication of the backward Monte-Carlo difference equation of motion, Eq. (5), 
leads to improved time convergence of the backward solution, Eq. (4). Finally, 
a summary of the main findings of this article follows in section 5. 



2 Sources of error 

The greatest disadvantage of the Monte-Carlo method is the unavoidable sta- 
tistical error caused by the use of random numbers. With the conventional, 
forward Monte-Carlo method, the (^-functions in the solution [see Eq. (2)] make 
binning of the particles necessary. With the number of particles in each bin 
proportional to N/Nun, the statistical error becomes 0{{N /N^)' 1 ' 2 ) with 
obvious notations. Backing out the solution by linear interpolation between 
the bins introduces an error O(N^). 

As was mentioned previously, the backward method was developed for cases 
where the solution is of interest only in a point or small part of phase space. 
In such situations it reduces the statistical error to 0(N~ 1 ^ 2 ). Even in cases 
where the backward method is used to calculate global solutions, and the sta- 
tistical error becomes (9((iV/iVbi n ) _1 / 2 ), it does not have a finite-bin-size error 
because the absence of 5-functions in the solution [see Eq. (4)] makes binning 
superfluous. 

The finite time step, At, also introduces an error term. An analysis of this 
time-step error requires some care; some mathematical subtleties are involved. 
There are two ways through which the time-step error can enter the numerical 
solution. 

First, directly through the numerical approximation of the Feynman-Kac for- 
mula itself. I.e, even if we knew the exact stochastic variables Aj(0), the back- 
ward solution, Eq. (4), would have an error due to the finite time step. To 
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estimate the magnitude of this error, we need to go back to the derivation 
of Ito's formula in Appendix A of Ref. [1] where terms of order 0(At 3 ^ 2 ) 
and higher were neglected. However, in the same way that the (9(At 1//2 )-term, 
which was kept, becomes zero after averaging, so does the 0(At 3//2 )-term. So, 
even if the X^~(0) were known to all orders, X^~(0) = Xj(0), the backward 
solution Eq. (4), would have an error 0(At 2 ). The error of the forward solu- 
tion, Eq. (4) would of course be of the same order. 

The other way in which the finite time step introduces an error into the so- 
lution is through an error in X^~(0). Again, going back to Ref. [1], we see 
that the drift term, fx At, on the right-hand side of the backward Monte-Carlo 
equation of motion, Eq. (5), has an error (9 (At 3 / 2 ), and the diffusive term, 
(ay/At, has an error O (At). The different character of these two error terms 
should be noted; the 0(At 3//2 )-term is deterministic whereas the (9(At)-term 
is stochastic. It can be shown (see e.g. Ref. [5]) that after averaging over all 
the Xj~(Q), the approximation Eq. (5) introduces an O(At) error into the 
backward solution. Again, the situation is equivalent for the forward method. 



In section 4 we will discuss how the time-step error of the backward solution 
can be reduced. But before that, we will study the backward and forward 
solutions to a simple model equation in the next section. 



3 Validation and benchmarking 



To validate and benchmark the backward method and to compare it to the 
conventional forward method, we will solve a simple model equation that has 
an analytic solution. We have chosen the Lorentz equation that has been used 
to model pitch-angle scattering in plasmas [6]: 

f = | fl I' -1***1. < 6 > 

where the diffusion coefficient D = (1 + x)(l — x) and the initial condition is 
f(x, 0) = Y,Zo(M/ 2 ) p i( x o)Pe(x)e' £{£+1)To , where P t {x) is the Legendre poly- 
nomial. The analytic solution is f(x, T) = EZo(£+ l / 2 ) p i( x o)Pe(x)e~ e{e+1)iTo+T) . 



In the following we will use the parameters: x = —0.9, T = 0.1, and T = 0.1; 
while N, Nun, and At will be varied so that their impact on the error can be 
studied. Before we get into a detailed analysis of accuracy and convergence, 
we show the solutions (with N = 2 x 10 5 , iV bin = 20, and At = 10~ 2 ) to 
the Lorentz equation (6) in Fig. 3 below. Solving for f(x,T) over the whole 
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N = 2x1.0 5 , N bin = 


20, At= 1 0" 2 







-1 .0 



1 .0 



0.9£ 



1 .00 



Fig. 3. Black line is analytic solution, red line is backward, and blue line is forward 
Monte-Carlo solution. Global solutions are to the left and tail solutions to the right. 

interval — 1 < x < 1, both the backward, f<-(x,T), and the forward, f^(x,T) 
Monte-Carlo solutions are indistinguishable from the exact solution. However, 
solving only in the subinterval 0.98 < x < 1, the backward solution is a much 
better approximation of the exact solution due to its much smaller statistical 
error. Note especially that the forward solution drops to zero for x > 0.993 
because not a single particle finds its way into the last few bins [compare also 
with the sketch in Fig. 1]. The ability to focus in on a small subinterval and 
calculate efficiently (i.e. without wasting the vast majority of the particles) 
the local solution is one of the main strengths of the backward Monte-Carlo 
method. 



For a detailed error analysis we need a well-defined measure of the error. We 
will use the local relative error e^± = \ f^±(xo,T) — f(xo,T)\ / f(xo,T). In the 
following we will plot the logarithm of the error against the logarithm of one of 
the parameters N, iVbi n , and At, while keeping the other two parameters fixed. 
First we compare the statistical error of the forward and the backward solu- 
tions. To single out the statistical error from the finite bin-size and time-step 
errors we make the latter two small by choosing iV^n = 20, and At = 10 -4 . 
With N spanning six decades (N = 1,2,5,10,20,50,... ,10 6 ), we plot the 
error in the left frame of Fig. 4 below. The dotted lines are least-square fits 
whose slopes approximate the exponents that determine the scaling of the er- 
ror. With the forward method, the bins adjacent to Xq are empty for small N 
(N = 1, 2, 20). As a result, £_> = 1 for these values of N, and the slope becomes 
flat. The low- A" data points have thus been excluded from the least-square fit 
to loge^. As can be seen, the statistical error [(^(A^ 1 / 2 ) with Abi n fixed] then 
scales as predicted. The exponents -0.43 and -0.41 are as close to the theo- 
retical value of -1/2 as could be expected, given the fact that the other small 
but non-zero error terms always tend to flatten the slope. The forward and 
backward methods thus converge at the same rate as A is increased, but the 
forward method needs iVbm times more particles to achieve the same accuracy 
as the backward method. This can easily translate into orders of magnitude 
in terms of execution time. 
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0.0 i Kl 6.0 O.O | Kl 4.0 

log N log N bin 

Fig. 4. Scaling of the error in the forward (blue) and backward (red) solutions. The 
error as a function of the number of particles (N) is plotted in the left frame, and 
as a function of the number of bins (-/Vbin) to the right. 



Next, we study the bin-size error (and the statistical error) by varying A^in 
over four decades (iVbm = 1, 2, 5, . . . , 10 4 ) while keeping fixed iV = 10 4 and 
At = 10~ 4 . The result is shown in the right frame of Fig. 4 above. The first 
observation is that the backward solution is completely unaffected by the bin 
size; is constant and the slope is zero. The error in the forward solution, 
e^, exhibits a more interesting behavior with different scalings for small and 
large A^m- With few bins there are many particles in each bin, so the sta- 
tistical error is small. The bin-size error, O(N^), caused by backing out the 
solution by linear interpolation between these huge bins, however, becomes 
large. For small A^m, the slope of iog£_> is -1.90, in excellent agreement with 
the theoretical value of -2. With many bins, the bin-size error becomes small, 

1 /2 

but with few particles in each bin, the statistical error [0(N£r) with N fixed] 
becomes dominant as is evidenced by the slope (0.48 < 1/2) for large A^m- 
The shape of the ioge_^ graph illustrates the conflict between resolution and 
statistics that is inherent for the forward Monte-Carlo method. The backward 
solution, however, can be calculated in two points arbitrarily close without 
affecting the statistical error. This is yet another of the main strengths of the 
backward method. 

Turning now to the time-step error, we fix N = 2 x 10 5 and Nun = 20 and vary 
At = 10~ 5 , 2 x 10~ 5 , 5 x 10" 5 , . . . , 10 _1 . As can be seen in Fig. 5 below, both 
the forward and the backward methods converge as predicted (the respective 
slopes are < 1) down to a time step At « 10~ 3 . At this point the statistical 
error becomes dominant. 
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Fig. 5. The error as a function of the time step (At) with the forward (blue) and 
backward (red) methods. 

4 A higher-order backward method 



As was discussed in section 2, the dominant O (At) time-step error in the 
Monte-Carlo equations of motion, Eqs. (3) and (5), comes from the diffusive 
term Qa\fKt. There is thus reason to suspect that a higher-order diffusive 
term might result in better overall time convergence, by reducing the time- 
step error to 0(At 3 ^ 2 ). In appendix B it is shown that the next-higher-order 
Monte-Carlo equation of motion is 



Xf(t ± At) = Xf(t) + |(1 + C 2 ) ft At + (aVAt + 0(At 3/2 ) 



(7) 



In the left frame of Fig. 6 below the time-convergence study presented in Fig. 5 
above is repeated using the higher-order approximation of Eq. (7). In the right 
frame exactly the same experiment is repeated with ten times more particles 
to further reduce the statistical error. Somewhat surprisingly, the forward so- 
lution does not converge at all. The backward solution, however, exhibits the 
faster time convergence we had hoped for (the slope is < 3/2). 





— 5.0 



log At 
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log At 



- 1 .o 



Fig. 6. The error as a function of the time step (At) with the higher-order forward 
(blue) and backward (red) methods with two hundred thousand particles (left frame) 
and two million particles (right frame). 
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Milshtein, who was not explicitly concerned with solving parabolic equations, 
did investigate the scaling of the error of expectation values similar to the 
ones of interest here [5]. His results are consistent with the O(At) error of the 
forward and backward lower-order solutions. However, for the higher-order 
methods introduced in this section, Milshtein's results would still indicate a 
O(At) error for both the forward and backward solutions, in clear disagree- 
ment with the scalings of Fig. 6. We hope to be able to resolve the discrepancies 
in future work. 

5 Summary 

We have shown that the backward Monte-Carlo method works as expected, 
i.e. it dramatically reduces the statistical error in situations where the solution 
is sought only in a small part of phase space. Furthermore, even in cases where 
we solve for the global solution, the backward method removes the conflict be- 
tween resolution and statistics. This has great practical significance e.g. when 
the gradient of the solution is of interest. 

We have also shown how a remarkably simple modification of the backward 
Monte-Carlo equation of motion leads to improved time convergence. 
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A A new perspective on the forward Monte-Carlo method 



The aim of this appendix is to derive the conventional, forward, weighted 
Monte-Carlo method in the same manner as the backward method was de- 
rived [1]. The starting point is the forward SDE: 

dX~* = pdt + adW . (A.l) 

Solving this SDE is trivial; following exactly the same procedure as for the 
backward SDE, we get 

X^(t + At) = X^(t) + fiAt + + O(At) , (A.2) 

where ( is a zero-mean, unit- variance Gaussian random number, ( e N(0, 1). 
When the SDE is backward, the obvious initial condition is X i ~(t = T) = x. 
For the forward SDE (A.l), we simply do not know which initial condition to 
impose; we will have to leave X~*(t = 0) undefined for now. 



As before [1], we use the Ito formula to differentiate f(X(t),t) and obtain 

f(X^(T),T) = f(X^(0),0)+ f T 2f t dt+ f T af x dW , (A.3) 

JO JO 

where again we have identified \x = dD/dx, a = V2D, and used 
ft + l^fx + \o~ 2 fxx = [Eq. (1)] = 2f t . Using the macroscopic initial condition, 
f(x, 0) = $(x), we get 

f(X~*(T),T) = $ (X^(0)) - £ af x dW . (A.4) 

At this stage of the derivation of the backward method, we used the micro- 
scopic initial condition and found that the equivalent of the LHS of Eq. (A.4) 
was in fact a solution to Eq. (1): f(X^(t = T),T) = [X^(t — T) = x] 
= f(x,T). But here, in the forward derivation, we cannot do that because 
X~*(T) 7^ x\ Going forward in time, X~*(T) is an unknown; we could use the 
Monte-Carlo equation of motion, Eq. (A.2), to find X~*(T), but we do not 
have an initial condition! So, is there any way to get f(x,T) from Eq.(A.l)? 
The answer is yes; there is a (rather contrived) way. 

If we let f(x,T) be a distribution, we can write it as 

f(x,T) = E[f\X-(T),T) x 5{x-X-{T))] . (A.5) 
Substituting Eq. (A.4) into Eq. (A.5), we get: 

f(x,T) = E[*(X^(0)) x 5(x-X^(T))\ . (A.6) 
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The numerical approximation of this expectation value is the forward weighted 
Monte-Carlo solution: 

N 

Ux,T) = N-^HXr(0)) x 6(x-Xr(T)) . (A.7) 
1=1 

Now, we still need to know what the microscopic initial condition on Xf*(0) 
should be. The best we can do is to make sure that f-^(x, T — 0) $(x). Note 
that "approximately equal" must be given a very liberal definition because 
f-+(x, T — 0) is a jagged sum of (^-functions, whereas <&(#) is in general a smooth 
function. The stochastic variables Xf*(0) can now be found by following the 
stochastic trajectories given by the forward Monte-Carlo equation of motion, 
Eq. (A. 2), with the "fuzzy initial condition": 

X?(0) E {Xf (0) | Ux,T = 0) « . (A.8) 



B A higher-order Monte-Carlo equation of motion 

The diffusive term (a(X(t))VAt of the Monte-Carlo equations of motion, 
Eqs. (3) and (5), is an approximation of the Ito integral 

rt+At ft+At 

/ a(X(s))dW(s) = a(X(t)) dW(s) + 0(At), (B.l) 

Jt Jt 

where X(s) is the stochastic process that solves the SDE 

dX = fi(X(s)) ds + a(X(s)) dW(s) , (B.2) 

where W is a Wiener process, and we impose the initial condition X(t) = x t . 
The approximation Eq. (B.l) replaces a(x) with its zero-order Taylor expan- 
sion, <7o(x) = a(x t ). To reduce the error in Eq. (5) to 0(At 3/2 ) we Taylor 
expand a(x) to first order, 

CTi(x) = (r(x t ) + cr\x t )(x - x t ) , (B.3) 

and approximate the solution to Eq. (B.2) with 

X(s) = x t + v(x t )(s -t) + a(x t )[W(s) - W{t)} . (B.4) 

The Ito integral over a becomes: 
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/ a(X(s))dW{s) = a(x t ) dW(s) + 
Jt Jt 

rt+At 

a\x t ) J {fx(x t )(s -t) + a(x t )[W(s) - W(t)}} dW(s) + 0(At 3 / 2 ) = 

/t+At rt+At 
dW(s) + a(x t ) a\x t ) j [W(s) - W(t)} dW(s) + 0(At 3 / 2 ) 



(B.5) 



where fi(x t ) f{ +At (s-t) dW(s) is C(At 3/2 ) and hence was neglected. The first 
term on the RHS of Eq. (B.5) is just the low-order approximation Eq. (B.l). 
The second term is the next-order correction. To calculate this correction term, 
we first need some intermediate results. 



Following Bjork [7], we introduce the stochastic variables 

4 = E W (tk) [W(tk+i) - W{t k )\ , (B.6) 

3=0 

and 

B n = J2W(t k+1 )[W(t k+1 ) - W{t k )\ , (B.7) 

3=0 

where tj = j t/n , j — 0, 1, . . . , n — 1. It trivially follows that 

n-1 

B n + I n = J2[W(t k +i) + W(t k )][W(t k+1 ) - W(t k )\ = W\t) , (B.8) 

3=0 

and 

n-1 

B n - I n = , £[W(t k+1 ) - W(t k )] 2 = S n (t) . (B.9) 

3=0 

In Appendix A of Ref. [1] we showed that 

HmS n (()=i. (B.10) 

n^oo 

Subtracting Eq. (B.9) from Eq. (B.8) and taking the limit n — > oo, we obtain 
Km I n {t) = [ t W(s)dW(s) = \[W 2 (t) -t] . (B.ll) 

J 

Applying Eq. (B.ll) to Eq. (B.5) we get 
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rt+At 

/ a (X(s)) dW{s) = a(X(t)) [W(t + At) - W(t)] + 

\ a(X(t)) a'(X(t)) {[W{t + At) - Wit)} 2 - At} + G(At 3 / 2 ) . 

(B.12) 

In the absence of advection, as is the case for Eq. (1), oa 1 = //, and the 
higher-order Monte-Carlo equations of motion become 

Xf(t ± At) =Xf (t) + fiAt + (aVAl + |(C 2 - 1)//At + £>(At 3/2 ) = 

Xf (t) + |(1 + C 2 )Mt + C^v/At + G(At 3 / 2 ) , (B.13) 

where we have used the fact that Wit + At) — Wit) G iV(0, v / At); C is a zer °- 
mean, unit-variance Gaussian random number; £ G N(0, 1); and /i = fj,(X(t)) 
and cr = (x(X(t)). 

In Ref. [4] Milshtein derived Eq. (B.12) by introducing 
X*(t + At) =x t + Cl At + c 2 [W{t + At) - W{t)\ + c 3 [W(t + At) - W{t)f , 
and finding the c ± , c 2 and c 3 that minimize E[(X*(t + At) - X(t + At)) 2 ]. 
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